Abstract

Background: Skeletal dysplasias are a rare and heterogeneous group of genetic disorders affecting skeletal 
development. Patients with skeletal dysplasias suffer from many complex medical issues including degenerative 
joint disease and neurological complications. Because the data and expertise associated with this field are both
sparse and disparate, significant benefits will potentially accrue from the availability of an ontology that provides a 
shared conceptualisation of the domain knowledge and enables data integration, cross-referencing and advanced 
reasoning across the relevant but distributed data sources. 

Results: We introduce the design considerations and implementation details of the Bone Dysplasia Ontology. We 
also describe the different components of the ontology, including a comprehensive and formal representation of 
the skeletal dysplasia domain as well as the related genotypes and phenotypes. We then briefly describe 
SKELETOME, a community-driven knowledge curation platform that is underpinned by the Bone Dysplasia 
Ontology. SKELETOME enables domain experts to use, refine, extend and apply the ontology, without any prior
ontology engineering experience, to advance the body of knowledge in the skeletal dysplasia field.

Conclusions: The Bone Dysplasia Ontology represents the most comprehensive structured knowledge source for 
the skeletal dysplasias domain. It provides the means for integrating and annotating clinical and research data, not 
only at the generic domain knowledge level, but also at the level of individual patient case studies. It enables links 
between individual cases and publicly available genotype and phenotype resources based on a community-driven 
curation process that ensures a shared conceptualisation of the domain knowledge and its continuous incremental 
evolution. 



Background 

Skeletal dysplasias are a heterogeneous group of genetic 
disorders affecting skeletal development. There are cur- 
rently over 450 recognised types, clustered in 40 groups. 
Patients with skeletal dysplasias have complex medical 
issues including short stature, degenerative joint disease, 
scoliosis and neurological complications. These patients 
are also a precious resource for biomedical research as 
they enable scientists to study the effects of single genes 
on human bone and cartilage development and function. 
The resulting insights lead to a better understanding of 
the pathogenesis of more common connective tissue disorders
such as arthritis or osteoporosis.

Despite their importance, bone dysplasias are not
exploited to their full potential in biomedical research.
Since most conditions are rare (< 1:10,000 births) and
correct diagnosis is difficult, only a few medical centres 
worldwide have expertise in diagnosis and management 
of these disorders. On the other hand, the identification 
of many skeletal dysplasia genes and subsequent studies 
of their functions and interactions have led to an explo- 
sion of knowledge about bone and cartilage biology. The 
biomedical literature now contains a large amount of 
information about individual genes and gene interac- 
tions [1], but it is often difficult to grasp how these 
interactions work together in a broader context, such as 
at the growth plate. In turn, the focus on specific cases 
or genes makes it difficult to identify etiological relation- 
ships between skeletal dysplasias, or to recognise clinical 
or radiological characteristics that are indicative of
defects associated with specific molecular pathways. 

The International Skeletal Dysplasia Society (ISDS,
http://www.isds.ch/) has attempted to address some of these
problems with its Nosology of Genetic Skeletal Disorders.
Since 1972, the ISDS Nosology has listed all recognised
skeletal dysplasias and groups them by common clini- 
cal-radiographic characteristics and/or molecular disease 
mechanisms. The ISDS Nosology is revised every 4 
years by an expert committee and the updated version 
is published in a medical journal. The latest version is 
from 2010 and is presented in [2]. The ISDS Nosology 
is widely accepted as the "official" nomenclature for ske- 
letal dysplasias within the biomedical community. 

While the content of the Nosology is invaluable, the 
format of the Nosology has several shortcomings. Firstly,
the classification scheme is inflexible: each disorder is
listed in one group, based either on its clinical-radiographic
appearance or on its underlying molecular
genetic mechanism (although many disorders can be associated
with multiple groups). Secondly, very limited informa- 
tion is listed for each entry. Current information is lim- 
ited to: the OMIM [3] number, the chromosome locus, 
gene name and protein name. In other words, the 
Nosology is not linked to freely available and widely 
used online repositories such as UniProt [4], limiting 
users' ability to further study the disorders. Thirdly, the 
Nosology associates diseases with specific genes but pro- 
vides no additional information on the responsible gene 
mutations. Fourthly, phenotypic and clinical-radio- 
graphic information is present intrinsically in the classi- 
fication, but not explicitly in the Nosology. Finally, due 
to its current publishing process, the content quickly 
becomes outdated, as genes or disorders discovered 
after the publication date cannot be included until the 
next revision (4 years later). For example, shortly after 
the publication of the newest version of the ISDS Nosology,
Gray et al. [5] showed that the Serpentine
fibula polycystic kidney syndrome (SFPKS) is characterised
by truncating mutations in NOTCH2, and consequently
proposed moving SFPKS from the
Filamin Group to the Osteolysis Group, due to its
genetic similarities with the Hajdu-Cheney syndrome.
Unfortunately, this information will be reflected in the
Nosology only in four years' time.

Over the past 10 years, ontologies have proven to 
represent a practical solution to data integration and 
knowledge acquisition, processing and management, 
particularly in the Healthcare and Life Sciences [6]. 
Their use in automated annotation [7,8] or cross-linking 
for query and retrieval purposes [9,10] is now broadly 
recognised in the biomedical field. As a result of their 
wide adoption and in order to enable collaboration and 
cross-fertilisation, several ontology repositories and
collections have been created. The Open Biomedical
Ontologies Foundry (http://www.obofoundry.org/) [11]
represents the most prominent collaborative collection
of biomedical ontologies, while the NCBO BioPortal
(http://bioportal.bioontology.org/) [12] is currently the
most comprehensive ontology repository in this domain.
Ontologies hosted in or linked from these two access 
points vary widely in size (ranging from several hun- 
dreds to hundreds of thousands of concepts) and 
domain (from imaging methods to cell behaviour or 
clinical terminology). While an extensive number of bio- 
medical topics have been covered, there remain topics 
where more comprehensive documentation is required. 
One such topic is the skeletal dysplasia domain. 

The Bone Dysplasia Ontology aims to complement the 
spectrum of existing ontologies and address the specific 
knowledge representation shortcomings of the ISDS 
Nosology. Its main role is to provide the scaffolding 
required for a comprehensive, accurate and formal 
representation of the genotypes and phenotypes involved 
in skeletal dysplasias, together with their specific and 
disease-oriented constraints. In contrast to the current
ISDS Nosology, the ontology provides a shared conceptual
model, formalised in a machine-understandable
description, that can evolve continuously and serve as a
foundational building block for knowledge
extraction and reasoning. The symbiosis between the
ontology and the community-driven knowledge curation 
platform built to support its evolution enables collabora- 
tive and incremental acquisition and encoding of 
advances by the experts in the field. Ultimately, it 
underpins mechanisms for sharing and re-use of data 
and information and advanced reasoning techniques for 
semi-automated diagnosis or disease features extraction. 

Methods 

The Bone Dysplasia Ontology has been built collabora- 
tively by a team of experts in skeletal dysplasias and 
ontology engineering. The design of the ontology was 
heavily influenced by the need to address the limitations 
of the ISDS Nosology, or more concretely, the need to 
capture the wealth of intrinsic knowledge of the domain 
described in diverse case studies or publications. Hence, 
the main purpose of the ontology coincides with its 
implicit role of providing a shared conceptualisation of 
the domain, and is not necessarily dependent on specific 
use cases. The community-driven knowledge curation 
platform built to support the ontology (described later 
in the article) enables a knowledge engineering cycle 
that combines a sustainable ontology evolution and 
quality-oriented process (enforced by the editorial roles 
embraced by the experts in the community) with the 
direct use of the ontology for semantic annotation of 
clinical summaries and collaborative decision-making.

The aim of the ontology is to support the evolving
knowledge in the skeletal dysplasia domain by providing
a formal foundation to be used by the community to
continuously update the classification of the disease
concepts, thus improving on the current publishing cycle
practice. A second, yet equally important, goal is to
bridge the phenotype and genotype information charac- 
terising the diseases, in order to build a comprehensive 
body of knowledge from the existing and emerging 
patient reports. Consequently, the three important pil- 
lars of the domain, as depicted in Figure 1, have been 
mapped to the top level classes of the ontology, i.e., 
Bone Dysplasia, Gene Mutation, Gene and Phenoty- 
pic Composite. The phenotype information is also cap- 
tured by adopting concepts from external ontologies. 

In order to avoid ambiguous interpretation and to 
enable compatibility between Bone Dysplasia Ontology 
concepts and concepts from other ontologies, the top 
level classes are rooted in entities defined by the Basic 
Formal Ontology (BFO, http://www.ifomis.org/bfo) [13],
and where possible by the Ontology of General Medical
Science (OGMS) [14], a middle ontology rooted in BFO
which provides a specific framework for medicine, to be
extended by specialised ontologies. The concept mappings
are as follows: (i) Bone Dysplasia
represents an ogms:OGMS_0000047 (Genetic disorder);
(ii) Gene is a snap:MaterialEntity; (iii) Gene Mutation
represents snap:SpecificallyDependentContinuant(s),
as every gene mutation is specific to a particular gene; and
(iv) Phenotypic Composite represents an
ogms:OGMS_0000023 (Phenotype).

The Bone Dysplasia, Gene and Protein terms were
manually extracted from the 2010 Revision of the ISDS
Nosology [2]. Gene classes were also augmented with
references to external resources, such as MeSH
(http://www.nlm.nih.gov/mesh/), OMIM or UniProt. Gene
Mutation descriptions were designed according to the
Mutation Nomenclature of the Human Genome Variation
Society [15], to capture the offset of the mutation
and the original and mutated content. For example,
GLY380ARG, 1138 G-A has an NCI:Missense Mutation
type attached, an offset of 1138, count 1, original
content G and mutated content A.
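
As an illustration of how such a description decomposes into its parts, the following Python sketch parses the example above. It is not part of the ontology or its tooling: the MutationDescription class, the parse_substitution function and the regular expression are hypothetical names introduced here, only simple nucleotide substitutions written as "<offset> <original>-<mutated>" are handled, and count is assumed to be the length of the original content.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class MutationDescription:
    """Hypothetical container mirroring the fields named in the text."""
    encoding: str
    offset: int
    count: int
    original_content: str
    mutated_content: str


# Matches the trailing "<offset> <original>-<mutated>" part, e.g. "1138 G-A".
_SUBSTITUTION = re.compile(
    r"(?P<offset>\d+)\s*(?P<orig>[ACGT]+)\s*-\s*(?P<mut>[ACGT]+)\s*$"
)


def parse_substitution(encoding: str) -> MutationDescription:
    """Split an encoding such as 'GLY380ARG, 1138 G-A' into its parts."""
    match = _SUBSTITUTION.search(encoding)
    if match is None:
        raise ValueError(f"unsupported mutation encoding: {encoding!r}")
    original = match.group("orig")
    return MutationDescription(
        encoding=encoding,
        offset=int(match.group("offset")),
        count=len(original),  # assumption: count = number of replaced bases
        original_content=original,
        mutated_content=match.group("mut"),
    )


if __name__ == "__main__":
    print(parse_substitution("GLY380ARG, 1138 G-A"))
```

Encodings that do not end in this pattern raise a ValueError rather than being guessed at, since the full nomenclature covers many more mutation kinds than this sketch does.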

In recent years, phenotype ontologies have been seen 
as an invaluable source of information, which can enrich 
and advance evolutionary and genetic databases [16]. 
One of the pioneering examples, and currently the most
comprehensive source of such information, is the
Human Phenotype Ontology (HP) [17]. We imported 
concepts from HP to augment the intrinsic skeletal dys- 
plasia genetic information with phenotypic descriptions. 
However, as noted by [18], most of the terms in HP
implicitly combine anatomical entities with qualities. For
example, Mitral valve prolapse (HP:HP_0001634) can
be decomposed into the anatomical entity Mitral valve
and the quality prolapsed. As a result, in order to capture
information currently not covered by HP, while
also taking into account the aforementioned distinction,
our top level concept Phenotypic Composite enables
the composition of (i) an Anatomical entity, a concept
imported from the Foundational Model of Anatomy
Ontology [19], or an Anatomical Composite, a concept
we introduce to model partonomies of anatomical entities,
with (ii) a Physical Object Quality, a concept imported
from the Phenotype and Trait Ontology (PATO) [18], or a
Quality Composite, a concept we define to capture
conjunctions of qualities and qualifiers (e.g., mildly
bowed). The complete structure of the Phenotypic
Composite can be seen in Figure 2. Qualities may also
have measurement units attached via concepts imported
from the Units of Measurement ontology (UO,
http://purl.org/obo/owl/UO). Finally, additional phenotypic
information, with an emphasis on clinical-radiographic features,
has been foreseen via the import of the Abnormality
concept of the Dynamic Radiological Electronic Atlas of
Malformation Syndromes ontology (dREAMS,
http://d-reams.org/?page_id=84).

Figure 1 The Bone Dysplasia Ontology. Generic overview of the Bone Dysplasia ontology structure. The high level boxes represent the three
pillars of the domain, i.e., bone dysplasias, phenotype and genotype information. The dotted boxes in the Phenotype information pillar denote
relationships-based dependencies on particular concepts from those ontologies and not subsumption.

Groza et al. BMC Bioinformatics 2012, 13:50
http://www.biomedcentral.com/1471-2105/13/50

The current import of all external concepts followed 
the minimum information to reference an external 
ontology term (MIREOT) guidelines [20]. The platform 
sustaining the evolution of the ontology will ensure that 
the import of any additional external concepts will 
respect the same guidelines. 



The structure of the ontology is a directed acyclic 
graph (DAG) based on taxonomical relations and using 
the rdfs:subClassOf construct. All classes have fully qualified
URIs, while the human-readable description is provided
via the rdfs:label property. Alternative definitions
(e.g., synonyms or acronyms), in addition to references
to external entities, are defined by existing or custom
OWL annotation properties, such as skos:altLabel, chromosomal_locus,
uniprot_id, omim_no or mesh_id. The
metadata describing the ontology is represented using
the Dublin Core vocabulary (http://purl.org/dc/terms/) and
its defined properties: dc:title, dc:creator, dc:contributor
and dc:publisher.

We have formalised the ontology using OWL-DL [21],
one of the three sub-languages of the Web Ontology
Language (OWL), because it provides maximum
expressiveness without losing computational completeness.
OWL-DL defines constructs that enable: (i) boolean
combinations of class expressions (such as union or
intersection, required to integrate diverse vocabularies
for describing the phenotype information); (ii)
disjointness and equivalence class axioms; and (iii) arbitrary
cardinality restrictions. Furthermore, a wide range of mature
reasoners is available for this sublanguage,
which makes it an ideal candidate for real-world
practical applications.

From a pragmatic perspective, we opted for a
logical formalism, because only a well-structured, logical
representation framework is able to encode the relations
existing between phenotypic and genotypic characteristics
in the context of particular bone dysplasias. The
resulting class axioms not only properly encode the conceptual
real-world knowledge of the domain (e.g.,
Achondroplasia is characterized by a mutation in gene
FGFR3 and by Hydrocephalus or Lumbar hyperlordosis),
but also enable us to use this conceptual knowledge
to perform reasoning on patient instance data.

Figure 2 Connecting the bone dysplasias to phenotype information. Dotted circles represent concepts from external ontologies. The
direction of the arrows has the same meaning as in Figure 3.

The Bone Dysplasia Ontology was curated manually
using the Stanford Protege-OWL 4.1 ontology editor
(http://protege.stanford.edu/). For reasoning purposes, the
ontology imports (via owl:imports statements) the
Human Phenotype Ontology and the Phenotype and
Trait Ontology, as well as specific concepts from
other ontologies, as specified above and further
described in the following section. Consistency
checking was performed by running the OWL-DL reasoners
Pellet v2.1.2 [22] and HermiT v1.3.3 [23] over
the ontology, to analyse both the class and object property
definitions.

Results and discussion 

This section details the classes defined by the Bone Dys- 
plasia Ontology and the class axioms and relations that 
we have introduced in order to accurately model the 
existing knowledge in the domain. It also discusses the 
availability of the ontology and our envisioned revision 
and extension cycle. 

The Bone Dysplasia Ontology classes 

The structure of the ontology is conceptually built 
around three main knowledge pillars: bone dysplasias, 
genotype information and phenotype information, as 
depicted in Figure 1. The ontology consists of 1228 
own-defined classes, of which 515 define bone dyspla- 
sias, 254 define genes, 361 define gene mutations and 
224 define proteins. 

The skeletal dysplasias component comprises the hier- 
archy of diseases, starting from the Bone Dysplasia 
super-concept which is refined via taxonomical relations 
(i.e., rdfs:subClassOf) to 40 specific groups of diseases
(e.g., Acromelic Dysplasias, Aggrecan Group or
Patellar dysostoses) and then to dysplasias defined
within the groups. Figure 4 presents a small portion of
the classification. In principle, the hierarchy has two
levels, i.e., the group level and the leaf level representing
bone dysplasias; however, there are also cases where the
depth of the hierarchy is three. Such an example exists
in the Craniosynostosis syndromes group, where the
class Pfeiffer syndrome FGFR2-related has two subclasses:
Antley-Bixler variants caused by FGFR2 mutations
and Jackson-Weiss syndrome. In principle, all
classes defined at this level, such as the aforementioned
two, represent diseases maintained only for historical
purposes. The Nosology mentions them as being subsumed
by some other disorders (via simple
observations), like the Pfeiffer syndrome FGFR2-related
in our example, and hence we added them as
subclasses of the corresponding concepts in the
hierarchy.
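
The shape of this hierarchy can be illustrated with a small Python sketch. It is only a toy model under stated assumptions: the SUBCLASS_OF dictionary hand-copies a few class names from the text (the placement of Pfeiffer syndrome FGFR2-related directly under its group is assumed from the description), and the is_acyclic and depth helpers are hypothetical functions, not part of the ontology or of any reasoner.

```python
from typing import Dict, List

# Child class -> direct parents, i.e. the rdfs:subClassOf edges (toy excerpt).
SUBCLASS_OF: Dict[str, List[str]] = {
    "Bone Dysplasia": [],
    "Craniosynostosis syndromes": ["Bone Dysplasia"],
    "Pfeiffer syndrome FGFR2-related": ["Craniosynostosis syndromes"],
    "Jackson-Weiss syndrome": ["Pfeiffer syndrome FGFR2-related"],
}


def is_acyclic(subclass_of: Dict[str, List[str]]) -> bool:
    """Depth-first search that reports False as soon as a cycle is found."""
    in_progress, done = set(), set()

    def visit(node: str) -> bool:
        if node in done:
            return True
        if node in in_progress:  # reached a class that is still being expanded
            return False
        in_progress.add(node)
        for parent in subclass_of.get(node, []):
            if not visit(parent):
                return False
        in_progress.discard(node)
        done.add(node)
        return True

    return all(visit(node) for node in subclass_of)


def depth(node: str, subclass_of: Dict[str, List[str]]) -> int:
    """Length of the longest subClassOf path from node up to a root."""
    parents = subclass_of.get(node, [])
    if not parents:
        return 0
    return 1 + max(depth(parent, subclass_of) for parent in parents)


if __name__ == "__main__":
    print(is_acyclic(SUBCLASS_OF))
    print(depth("Jackson-Weiss syndrome", SUBCLASS_OF))
```

In this excerpt a group sits at depth one and a historical sub-disorder such as Jackson-Weiss syndrome at depth three, which matches the exceptional three-level case described above.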

Similar or equivalent bone dysplasia concepts, including
an HP_0002652 (Skeletal dysplasia) super-concept,
are also defined by the Human Phenotype Ontology.
However, a correct alignment between these terms and
the terms defined within our ontology, both from the
domain and the logical perspectives, cannot be realised
due to either the vagueness or the improper granularity
of the concepts. For example, HP defines concepts such
as HP_0005716 (Lethal skeletal dysplasia) or
HP_0005685 (Severe skeletal dysplasia), which seem to
be qualities rather than proper disease definitions. Similarly,
concepts like HP_0002654 (Multiple epiphyseal
dysplasia) are defined in our ontology at a much more
fine-grained level via several concepts, in this case
via seven classes (see Multiple epiphyseal dysplasia
and pseudoachondroplasia Group).

The genotype information pillar captures Gene Mutation(s)
and their associated Gene(s) and Protein(s) (see
Figure 3). Each of these concepts has a corresponding
class in the ontology and subsumes particular sub-concepts.
Gene Mutation classes are related to Gene
classes via the has_locus relation. Similarly, Protein
classes are related to Gene classes via the is_encoded_by
relation. The naming of the subclasses of these three
concepts follows an incrementally encoded structure,
e.g., GM0000001 for a gene mutation, G0000001 for a
gene, and P0000001 for a protein. However, genes and
proteins also have human-readable names provided via
the rdfs:label property and synonyms via the skos:altLabel
property. For example, G0000047 has the label
MNX1 and the alternative label (or synonym) HLXB9.
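
The naming scheme can be sketched in a few lines of Python. The prefixes are taken from the examples above; the helper names make_identifier and parse_identifier are hypothetical, and the zero-padded width of seven digits is an assumption read off those examples rather than a documented rule.

```python
import re
from typing import Tuple

# Prefix per concept kind, as used in the examples in the text.
PREFIXES = {"gene_mutation": "GM", "gene": "G", "protein": "P"}
_KIND_BY_PREFIX = {prefix: kind for kind, prefix in PREFIXES.items()}
_IDENTIFIER = re.compile(r"^(GM|G|P)(\d+)$")


def make_identifier(kind: str, index: int, width: int = 7) -> str:
    """Build an incrementally numbered class name, e.g. ('gene', 47) -> 'G0000047'."""
    if index < 1:
        raise ValueError("identifiers are numbered from 1")
    return f"{PREFIXES[kind]}{index:0{width}d}"


def parse_identifier(identifier: str) -> Tuple[str, int]:
    """Recover the concept kind and running number from a class name."""
    match = _IDENTIFIER.match(identifier)
    if match is None:
        raise ValueError(f"not a recognised identifier: {identifier!r}")
    return _KIND_BY_PREFIX[match.group(1)], int(match.group(2))


if __name__ == "__main__":
    print(make_identifier("gene", 47))
    print(parse_identifier("GM0000001"))
```

Because such names carry no meaning for a human reader, the label and synonym annotations described above remain the primary way to look a class up.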

Gene mutations are defined according to the Mutation
Nomenclature of the Human Genome Variation Society
[15], to capture the offset of the mutation, along with
the original and mutated content, via four datatype
properties: original_content, offset, count and mutated_content.
The type of the mutation is signalled by the
mutation_type relation between Gene Mutation and
NCI:Mutation Abnormality, the latter concept being
imported in our ontology together with its entire sub-structure
(see below for a gene mutation example).
Gene concepts are linked to multiple external resources,
e.g., OMIM, UniProt or MeSH, via corresponding annotation
properties: omim_no, uniprot_id, mesh_id, umls_cui,
ref_seq, entrezgene_id and accesion_no.

As a side remark, the ontology contains a second
Gene Mutation class, imported from the NCI thesaurus
as part of the NCI:Mutation Abnormality sub-tree.
The BDO and the NCI Gene Mutation classes are not
equivalent. The BDO Gene Mutation is an entity that
describes actual gene mutations (linked to a Gene and
having a type, encoding, offset, etc.). The NCI Gene
Mutation is, in reality, improperly defined because it
refers to a type of mutation and not to a gene mutation
per se. This can be easily observed by analysing the concepts
imported from NCI under the NCI:Mutation
Abnormality super-concept, which describe different
types of gene mutations. However, since we rely on
these mutation type concepts and import them according
to the MIREOT principle, we were not able to omit
this particular concept and thereby avoid the confusion.

Figure 3 Connecting the bone dysplasias to genotype information. Dotted circles represent concepts from external ontologies. The dotted
box represents annotation properties attached to the Gene concept. The direction of the arrows shows domain-range association in the
property definition.

The phenotype information (depicted in Figure 2) is
recorded in a highly extensible manner via the main
class Phenotypic Composite. The complex nature of
skeletal dysplasias can be observed in particular in the
wide range of clinical and radiographic characteristics
manifested by patients. Consequently, we opted for re-using
concepts from known ontologies that subsume
most of the phenotype information likely to arise in
patient records, e.g., REAMS:Abnormality for radiographic
features and HP:HP_0000118 (Phenotypic
abnormality) for other phenotypic findings. As discussed
earlier, our Phenotypic Composite class represents a
composite element that conceptually connects an
FMA:Anatomical entity or an Anatomical Composite
to a PATO:Physical Object Quality or a Quality Composite
using the describes and has_quality relations, or
can build upon existing composites via the OBO has_part
relation, in addition to connecting a PATO:Physical
Object Quality or a Quality Composite via
the has_quality relation. The FMA:Anatomical entity
and PATO:Physical Object Quality concepts have been
imported in our ontology; however, choosing particular
sub-concepts and specialising their relations to particular
dysplasias is deferred to the community and supported
by the platform described later in the paper.
Hence, Phenotypic Composite carries a scaffolding role
onto which particular elements can be created to fill
the gaps in the current phenotype ontologies.
Some definition examples are, however, presented both
in the ontology and in the following section.
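
To make the composition pattern concrete, here is a minimal Python sketch of the same structure as plain data classes. It is an analogy only and says nothing about how the ontology itself is stored: the class and field names mirror the concepts and relations named above, the leaf_terms helper is hypothetical, and the example instance reuses the "Oval translucency of proximal femur" composite that the ontology defines.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class AnatomicalComposite:
    has_part: str                   # an FMA anatomical entity
    has_anatomical_coordinate: str  # an FMA anatomical coordinate


@dataclass(frozen=True)
class QualityComposite:
    has_part: str       # a PATO physical object quality
    has_qualifier: str  # a PATO qualifier or quality


@dataclass(frozen=True)
class PhenotypicComposite:
    label: str
    describes: Union[str, AnatomicalComposite]
    has_quality: Union[str, QualityComposite]


def leaf_terms(composite: PhenotypicComposite) -> List[str]:
    """Collect the external ontology terms a composite is built from."""
    terms: List[str] = []
    for part in (composite.describes, composite.has_quality):
        if isinstance(part, str):
            terms.append(part)
        else:
            terms.extend(vars(part).values())
    return sorted(terms)


oval_translucency = PhenotypicComposite(
    label="Oval translucency of proximal femur",
    describes=AnatomicalComposite("FMA:Femur", "FMA:Proximal"),
    has_quality=QualityComposite("PATO:PATO_0001354", "PATO:PATO_0000947"),
)

if __name__ == "__main__":
    print(leaf_terms(oval_translucency))
```

The Union types reflect the choice left open by the ontology: a composite may point directly at an imported term or at an intermediate composite that refines it.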

Class axioms and relationships 

Table 1 lists the main relations introduced by the Bone
Dysplasia Ontology. The mode_of_inheritance relationship
links Bone Dysplasia(s) to diverse modes of inheritance
from the Human Phenotype Ontology (via
HP:HP_0000005). The mutation_type relation provides a
connection between Gene Mutation and the NCI:Mutation
Abnormality that defines all possible gene
mutation types, while the has_locus relation links Gene
Mutation to a Gene. Finally, the characterized_by relation
provides support for associating Bone Dysplasia(s)
to Phenotypic Composite(s), Gene Mutation(s) or
HP:HP_0000118 (Phenotypic Abnormality). To these are
added the two relations mentioned above, i.e., describes
and has_quality, the has_anatomical_coordinate
relation, used to connect an Anatomical Composite to an
FMA:Primary anatomical coordinate or FMA:Secondary
anatomical coordinate, and the has_qualifier relation,
which enables the attachment of qualifiers from
PATO:PATO_0000068 (Qualitative) and
PATO:PATO_0001241 (Physical object quality) to
Quality Composite(s).

Figure 4 Bone Dysplasias classification. The main Bone Dysplasia concept is connected to lower level concepts via rdfs:subClassOf relations.

Table 1 Relations defined in the Bone Dysplasia ontology

Relation                    Domain                 Range
--------------------------  ---------------------  ----------------------------------------------
characterized_by            Bone Dysplasia         Phenotypic Composite, Gene Mutation,
                                                   HP:HP_0000118 (Phenotypic abnormality)
mode_of_inheritance         Bone Dysplasia         HP:HP_0000005 (Mode of inheritance)
has_locus                   Gene Mutation          Gene
mutation_type               Gene Mutation          NCI:Mutation Abnormality
is_encoded_by               Protein                Gene
describes                   Phenotypic Composite   Anatomical Composite, FMA:Anatomical_Entity
has_quality                 Phenotypic Composite   Quality Composite,
                                                   PATO:PATO_0001241 (Physical object quality)
has_qualifier               Quality Composite      PATO:PATO_0000068 (Qualitative),
                                                   PATO:PATO_0001241 (Physical object quality)
has_anatomical_coordinate   Anatomical Composite   FMA:Primary_anatomical_coordinate,
                                                   FMA:Secondary_anatomical_coordinate
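
The domain and range information in Table 1 can be read as a simple lookup, sketched below in Python. This is a deliberate simplification and not how an OWL reasoner treats domains and ranges (there they drive inference rather than validation, and subclasses are taken into account); the RELATIONS dictionary is transcribed by hand from Table 1 and the is_well_typed function is a hypothetical helper.

```python
from typing import Dict, FrozenSet, Tuple

# relation -> (domain class, allowed range classes), transcribed from Table 1
RELATIONS: Dict[str, Tuple[str, FrozenSet[str]]] = {
    "characterized_by": ("Bone Dysplasia", frozenset(
        {"Phenotypic Composite", "Gene Mutation", "HP:HP_0000118"})),
    "mode_of_inheritance": ("Bone Dysplasia", frozenset({"HP:HP_0000005"})),
    "has_locus": ("Gene Mutation", frozenset({"Gene"})),
    "mutation_type": ("Gene Mutation", frozenset({"NCI:Mutation Abnormality"})),
    "is_encoded_by": ("Protein", frozenset({"Gene"})),
    "describes": ("Phenotypic Composite", frozenset(
        {"Anatomical Composite", "FMA:Anatomical_Entity"})),
    "has_quality": ("Phenotypic Composite", frozenset(
        {"Quality Composite", "PATO:PATO_0001241"})),
    "has_qualifier": ("Quality Composite", frozenset(
        {"PATO:PATO_0000068", "PATO:PATO_0001241"})),
    "has_anatomical_coordinate": ("Anatomical Composite", frozenset(
        {"FMA:Primary_anatomical_coordinate",
         "FMA:Secondary_anatomical_coordinate"})),
}


def is_well_typed(relation: str, subject_class: str, object_class: str) -> bool:
    """Check a statement against the declared domain and range (exact match only)."""
    if relation not in RELATIONS:
        return False
    domain, allowed_range = RELATIONS[relation]
    return subject_class == domain and object_class in allowed_range


if __name__ == "__main__":
    print(is_well_typed("has_locus", "Gene Mutation", "Gene"))
    print(is_well_typed("has_locus", "Protein", "Gene"))
```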

A major aim of the Bone Dysplasia ontology is to 
underpin a community-driven knowledge curation platform
that enables collaborative decision making and
knowledge exchange among the experts in the field. In 
order to support the decision making process (i.e., colla- 
borative diagnosis), as well as the transfer of knowledge 
from particular patient studies to the generic concept 
definitions, we encoded the semantics of the emerging 
knowledge discoveries in class axioms and restrictions. 
Furthermore, to reflect the current domain knowledge 
about each specific dysplasia accurately, these class 
axioms are specialised at the lower levels of the Bone 
Dysplasia concept with more specific details. As a result, 
more than 70% of actual bone dysplasia concepts are 
linked to gene mutations, and around 80% of the same 
concepts have phenotype information attached (via more 
than 2,000 phenotypes imported from the Human Phe- 
notype Ontology). The lack of class axioms in the rest of 
the bone dysplasia concepts is due, in principle, to two 
factors. From the genetic perspective, the corresponding 
bone dysplasias currently have no established links with 
particular genes, while from the phenotype perspective, 
we were, until now, unable to mine disorder-phenotype 
relations for the corresponding bone dysplasias. 

The class definitions of three of the top-level concepts
(Gene is omitted, as it is an independent material entity) are
presented below, using the OWL Manchester syntax:

Class: Bone_Dysplasia
    SubClassOf:
        OGMS:OGMS_0000047
    SubClassOf:
        characterized_by only (REAMS:Abnormality or HP:HP_0000118
            or Phenotypic_Composite or Gene_Mutation)
    SubClassOf:
        mode_of_inheritance only HP:HP_0000005
    Annotations:
        skos:description "A genetic disorder that involves abnormal
            development of bones and connective tissues."

Definition: Bone_Dysplasia is defined as a specialisation
of an entity that has two restrictions: (i)
all concepts that characterise this entity (via
characterized_by) are Gene_Mutations or Phenotypic_Composites,
or REAMS:Abnormality or
HP:HP_0000118 (Phenotypic abnormality), and (ii) all concepts
providing a mode_of_inheritance for this entity are
HP:HP_0000005 (Mode of inheritance).

Class: Gene_Mutation
    SubClassOf:
        SNAP:SpecificallyDependentContinuant
    SubClassOf:
        has_locus only Gene and has_locus some Gene
    SubClassOf:
        mutation_type only NCI:Mutation_Abnormality
        and mutation_type some NCI:Mutation_Abnormality
    Annotations:
        skos:description "A change or alteration in a gene."

Definition: Gene_Mutation is defined as a specialisation
of an entity that has two restrictions: (i) all concepts
acting as a locus for this entity (via has_locus) are
Genes and there is at least one such Gene that is the
locus of this entity, and (ii) all concepts that define the
mutation_type for this entity are NCI:Mutation_Abnormality
and there is at least one such NCI:Mutation_Abnormality
that provides a mutation type.

Class: Phenotypic_Composite
    SubClassOf:
        OGMS:OGMS_0000023
    SubClassOf:
        (has_part some Phenotypic_Composite
            and has_part only Phenotypic_Composite)
        or
        (describes some FMA:Anatomical_entity
            and describes only FMA:Anatomical_entity)
        or
        (describes some Anatomical_composite
            and describes only Anatomical_composite)
    SubClassOf:
        (has_quality only PATO_0001241
            and has_quality some PATO_0001241)
        or
        (has_quality only Quality_Composite
            and has_quality some Quality_Composite)
    Annotations:
        skos:description "A continuant describing the conjunction
            between a quality and an anatomical part or an anatomical
            composite."

Definition: Phenotypic_Composite is defined as a specialisation
of an entity that has two restrictions: (i) the
entity either has_part some concepts that are all Phenotypic_Composite
and there exists at least one such Phenotypic_Composite
that is a part of the entity, or all concepts
described by this entity (via describes) are FMA:Anatomical_entity
and there is at least one such FMA:Anatomical_entity
that is described by the entity, or all concepts
described by this entity (via describes) are Anatomical_Composite
and there is at least one such Anatomical_Composite
that is described by the entity, and (ii) all
concepts that define a quality for this entity (via has_quality)
are PATO:PATO_0001241 and there is at least one
such PATO:PATO_0001241 that provides a quality, or
all concepts that define a quality for this entity (via has_quality)
are Quality_Composite and there is at least one
such Quality_Composite that provides a quality.
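
The recurring pairing of "only" and "some" in these definitions can be illustrated with a short Python sketch. It evaluates the two restrictions over an explicitly listed set of filler classes, which is a closed-world simplification and not the open-world semantics an OWL-DL reasoner applies; the three function names are hypothetical.

```python
from typing import Iterable, Set


def satisfies_only(fillers: Iterable[str], allowed: Set[str]) -> bool:
    """Universal restriction: every listed filler belongs to an allowed class."""
    return all(filler in allowed for filler in fillers)


def satisfies_some(fillers: Iterable[str], allowed: Set[str]) -> bool:
    """Existential restriction: at least one listed filler belongs to an allowed class."""
    return any(filler in allowed for filler in fillers)


def satisfies_only_and_some(fillers: Iterable[str], allowed: Set[str]) -> bool:
    """The combined pattern 'p only C and p some C' used in the definitions."""
    fillers = list(fillers)  # allow generators to be checked twice
    return satisfies_only(fillers, allowed) and satisfies_some(fillers, allowed)


if __name__ == "__main__":
    # 'only' alone is vacuously true when nothing is asserted at all.
    print(satisfies_only([], {"Gene"}))
    print(satisfies_only_and_some([], {"Gene"}))
    print(satisfies_only_and_some(["Gene"], {"Gene"}))
```

The first call shows why a universal restriction on its own is not enough: an entity with no fillers at all satisfies it. Adding the existential restriction, as the Gene_Mutation and Phenotypic_Composite definitions do, rules that case out.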

Below, we illustrate a series of concrete concept definition
examples, for the Achondroplasia and
GM0000001 classes and two particular Phenotypic
Composites, Translucency of proximal femur and Oval
translucency of proximal femur, by showing some of
the definition constraints and, in the case of the gene
mutation, the information captured along the lines of
the Mutation Nomenclature of the Human Genome
Variation Society:

Class: Achondroplasia
  SubClassOf:
    characterized_by only (GM000001 or GM000361
      or HP_0000238 or HP_0002938 or HP_0002968 or HP_0003505 or ...)
  SubClassOf:
    mode_of_inheritance only HP_0000006
    and mode_of_inheritance some HP_0000006

Class: GM000001
  SubClassOf:
    has_locus only GO000001
    and has_locus some GO000001
  SubClassOf:
    mutation_type only NCI:Missense_Mutation
    and mutation_type some NCI:Missense_Mutation
  Annotations:
    encoding "GLY380ARG, 1138 G-A", offset 1138,
    original_content "G", mutated_content "A"

Class: PC_0000004
  SubClassOf:
    describes only AC_0000001
    and describes some AC_0000001
  SubClassOf:
    has_quality only PATO:PATO_0001354
    and has_quality some PATO:PATO_0001354
  Annotations:
    label "Translucency of proximal femur"
    skos:description "Translucent proximal area of femur"

Class: AC_0000001
  SubClassOf:
    has_part only FMA:Femur
    and has_part some FMA:Femur
  SubClassOf:
    has_anatomical_coordinate only FMA:Proximal
    and has_anatomical_coordinate some FMA:Proximal
  Annotations:
    label "Proximal femur"
    skos:description "The proximal area of the femur"

Class: PC_0000005
  SubClassOf:
    describes only AC_0000001
    and describes some AC_0000001
  SubClassOf:
    has_quality only QC_0000001
    and has_quality some QC_0000001
  Annotations:
    label "Oval translucency of proximal femur"
    skos:description "Oval-shaped translucent area of the proximal femur"

Class: QC_0000001
  SubClassOf:
    has_part only PATO:PATO_0001354
    and has_part some PATO:PATO_0001354
  SubClassOf:
    has_qualifier only PATO:PATO_0000947
    and has_qualifier some PATO:PATO_0000947
  Annotations:
    label "Oval translucency"
    skos:description "Oval-shaped area of translucency"
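
The annotations on the GM000001 class listed above (encoding, offset,
original_content and mutated_content) carry enough information to assemble
strings in the style of the HGVS nomenclature. The following Python sketch
shows one way this could be done. The class and function names are
hypothetical, and the assumption that the encoding annotation always begins
with a three-letter residue, a position and a three-letter residue is ours,
based only on the single example shown.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class MutationAnnotation:
    """Hypothetical container for the annotation values of a gene mutation."""
    encoding: str           # e.g. "GLY380ARG, 1138 G-A"
    offset: int             # nucleotide position of the change
    original_content: str   # reference nucleotide
    mutated_content: str    # substituted nucleotide


# Assumed layout of the protein part of the encoding: GLY380ARG
_PROTEIN_PART = re.compile(r"^([A-Z]{3})(\d+)([A-Z]{3})\b")


def cdna_change(m: MutationAnnotation) -> str:
    """Format a nucleotide substitution in HGVS style, e.g. c.1138G>A."""
    return f"c.{m.offset}{m.original_content}>{m.mutated_content}"


def protein_change(m: MutationAnnotation) -> str:
    """Format the amino-acid substitution in HGVS style, e.g. p.Gly380Arg."""
    match = _PROTEIN_PART.match(m.encoding.strip())
    if match is None:
        raise ValueError(f"unrecognised encoding: {m.encoding!r}")
    ref, position, alt = match.groups()
    return f"p.{ref.capitalize()}{position}{alt.capitalize()}"
```

For the GM000001 annotations, this sketch yields c.1138G>A for the
nucleotide change and p.Gly380Arg for the protein change.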

Availability 

Table 2 summarises the main characteristics of the Bone Dysplasia
Ontology. The current release of the ontology has the version number 1.5,
and the namespace of the ontology is
http://purl.org/skeletome/bonedysplasia. The classification of the bone
dysplasias defined in the ontology corresponds to the ISDS Nosology 2010
[2], which has only recently been published. The ontology can be
retrieved directly from the given namespace, or visualised using the NCBO
BioPortal at http://bioportal.bioontology.org/ontologies/1613.

The design of the ontology aims to re-use and adopt existing vocabularies
in order to minimise the re-invention, duplication and overlap of
concepts. Consequently, the ontology imports, following the MIREOT
guidelines [20], a series of concepts from external resources, as
previously discussed. Additionally, the Gene concepts include references
to OMIM, UniProt, MeSH and UMLS via corresponding annotation properties,
while the Bone Dysplasia concepts refer to OMIM and MeSH.

Table 2 Bone Dysplasia Ontology fact sheet

Name                    Bone Dysplasia Ontology
Namespace               http://purl.org/skeletome/bonedysplasia
Prefix                  BDO
Scope                   skeletal dysplasias, genes, proteins, gene mutations
                        and phenotypic characteristics in human
Format                  OWL-DL
Number of classes       1228
Dependencies (import)   HP, PATO, NCI (Gene mutation types)
Dependencies (weak)     FMA, REAMS
Annotations             rdfs:label, skos:altLabel, uniprot_id,
                        entrezgene_id, ref_seq, mesh_id, locus, omim_no,
                        umls_cui, accession_no, skos:description

Revising and extending the Bone Dysplasia Ontology

The Bone Dysplasia Ontology has been built as a foundation block for
SKELETOME, the skeletal dysplasia knowledge curation platform (described
in the following section). As such, support for extensibility is
important to cope with the complex and evolving nature of the field.
Consequently, the SKELETOME platform has been designed to enable
round-trip knowledge engineering, which assumes the evolution of the
ontology. New discoveries emerging from patient studies can be
transferred by domain experts at the conceptual level (via class axioms),
through extensions to the ontology and through additional semantic
inference rules, as well as at the instance level as new case data
becomes available. In addition to refined class definitions via
specialised restrictions, the platform allows users with editorial roles
to alter the bone dysplasia classification by creating or deleting
groups, or by moving diseases between groups. This leads to a continuous
evolution of the ontology and, inherently, of the Nosology and bone
dysplasia knowledge.
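
The editorial operations mentioned above (creating or deleting groups and
moving diseases between groups) can be sketched as a small Python model.
This is an illustration only, not the SKELETOME implementation: the class
name, the method names and the two policies it enforces (each disease
belongs to exactly one group, and a non-empty group cannot be deleted) are
assumptions made for the example.

```python
from typing import Dict, Optional, Set


class Classification:
    """Hypothetical toy model of a disease classification organised in groups."""

    def __init__(self) -> None:
        self._groups: Dict[str, Set[str]] = {}

    def create_group(self, group: str) -> None:
        if group in self._groups:
            raise ValueError(f"group already exists: {group}")
        self._groups[group] = set()

    def delete_group(self, group: str) -> None:
        # Assumed policy: a group that still holds diseases cannot be
        # deleted, so that no disease is left unclassified.
        if self._groups[group]:
            raise ValueError(f"group is not empty: {group}")
        del self._groups[group]

    def group_of(self, disease: str) -> Optional[str]:
        for group, members in self._groups.items():
            if disease in members:
                return group
        return None

    def add_disease(self, disease: str, group: str) -> None:
        # Assumed policy: a disease belongs to exactly one group.
        if self.group_of(disease) is not None:
            raise ValueError(f"disease already classified: {disease}")
        self._groups[group].add(disease)

    def move_disease(self, disease: str, target: str) -> None:
        source = self.group_of(disease)
        if source is None:
            raise KeyError(disease)
        if target not in self._groups:
            raise KeyError(target)
        self._groups[source].discard(disease)
        self._groups[target].add(disease)
```

Rejecting an edit that would break one of the assumed invariants is one
simple way of keeping the classification consistent after every editorial
change; the group names used with this sketch would be placeholders rather
than actual Nosology groups.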

Comparison to related ontological resources 

Among the three pillars of the Bone Dysplasia Ontology, the actual
skeletal dysplasia knowledge (representing the core of the ontology) is
covered only superficially in other ontologies and vocabularies. Examples
such as the Systematized Nomenclature of Medicine-Clinical Terms
(SNOMED-CT, http://www.ihtsdo.org) [24], REAMS, the NCI Thesaurus [25] or
the Human Disease Ontology include, as also highlighted in the Background
section, only high-level concepts denoting the most commonly known
dysplasias. None of these existing related vocabularies attempts to
capture related genotype or phenotype information. The added value of the
Bone Dysplasia Ontology lies in the comprehensive classification of these
disorders, together with accurate descriptions (via class axioms and
relations) of their main genetic and phenotypic characteristics. We
regard the other ontologies, in particular REAMS, SNOMED-CT and the NCI
Thesaurus, as effective complements and important resources to be
cross-referenced and re-used (to avoid redundancy) to describe the
phenotype and genotype information of bone dysplasias.

To date, the integrity of the ontology has been ensured by
domain-expert-driven curation. Its applicability will be tested over
time, as evidenced by the extent of its changes and by the future growth
of the SKELETOME knowledge base and its associated community of users.

Community-driven knowledge curation 

The increasing use of ontologies in Healthcare and Life Sciences has led
to novel ways of processing digital content, which in turn have
introduced new possibilities of dealing with scientific publications and
data [6]. Such content processing techniques make knowledge more open and
exploitable than ever before [8,26].

Our focus is on ensuring a continuous enrichment and evolution of the
ontology by transferring knowledge present in existing and emerging
patient case studies into class axioms or cross-references to external
phenotype ontology concepts. In order to achieve this, we developed the
SKELETOME platform (http://skeletome.metadata.net/skeletome), a
community-driven knowledge curation platform that enables collaborative
input, sharing and re-use of data and information among experts in the
skeletal dysplasia domain. SKELETOME provides a central access point to a
rich skeletal dysplasia knowledge base, supported by low-level features
such as user and group-based access and privacy control. At the same
time, from a high-level perspective, the anonymised pool of case studies
enables statistical inference for knowledge discovery purposes or
computer-assisted diagnosis.

SKELETOME is built as a Drupal 7 (http://drupal.org/) instance, thus
inherently providing the collaborative aspects and also allowing us to
develop custom modules to suit our needs. The Bone Dysplasia Ontology
acts as the knowledge backbone of the platform. Each of the disease
concepts present in the hierarchy of skeletal dysplasias has been
imported, via its own module, into the platform and has an associated
human-readable page. The system structure is similar to a knowledge Wiki
that is built around the ontology. The user-friendly page corresponding
to each dysplasia presents a summary of the dysplasia and contains
pointers to external references. Registered members of the community can
add facts grounded in scientific publications (similar to the OMIM
structure) and can discuss facts added by other members. Members with an
editorial role can edit the summary of a dysplasia by incorporating the
facts widely accepted by the community. They are also able to alter the
bone dysplasia pages by, for example, moving them between groups. Such
operations have a direct impact on the ontology and are immediately
reflected in the underlying knowledge base. The platform enforces the
logical correctness of the ontology at all times, transparently to the
experts. In practice, we have created a round-trip knowledge engineering
process, driven by the experts in the community (only a few experts have
editorial roles), who are not required to possess any ontology
engineering skills.

The development of the actual knowledge base about bone dysplasias is
supported by SKELETOME's knowledge engineering cycle. On the one hand,
the BDO concepts (including concepts from the imported ontologies) are
used to annotate patient case studies that can be uploaded, analysed and
discussed by the members of the community. More concretely, the platform
enables manual and automatic semantic annotation of clinical summaries
(see Figure 5), as well as manual annotation of X-ray imagery. In
addition to annotation, SKELETOME uses the ontology to provide support in
the collaborative diagnosis process via an underlying decision support
mechanism that computes probabilistic correlations of phenotypes in the
context of a particular disorder, or a ranked list of disorders given
particular phenotypes. The actual mechanism performs association rule
mining on existing patient data and refines the resulting rules based on
the class axioms of the corresponding disorders before computing the
final probabilistic rankings. Overall, the patient information is
automatically linked, via the underlying ontological concepts, to the
bone dysplasia concepts and pages. On the other hand, from the dysplasias
perspective, the ontology creates an integrated view of the phenotype and
genotype emerging from patient reports and evolves based on the findings
provided by the analysis of patient cases combined with the current
domain knowledge. This is presented to the user in the form of an
ontology analytics service for exploratory purposes and is realised via
direct querying on the ontology (see Figure 6 for an example).

Figure 5 Semantic Annotation in SKELETOME. Tagging clinical summaries in
SKELETOME using the Bone Dysplasia ontology and references to external
ontologies. The example shows a clinical summary tagged with terms from
the Human Phenotype Ontology and subclasses of the Bone Dysplasia
concept.

Figure 6 Ontology analytics for data exploration. The SKELETOME platform
provides facilities for performing ontology analytics, presented to the
user in the form of charts for data exploration purposes. The chart in
the figure depicts the distribution of bone dysplasias with correlated
phenotypes, in the context of the Achondroplasia concept. The chart
cursor shows the six overlapping phenotypes with Larsen syndrome
dominant.
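
To make the idea of ranking disorders for a set of observed phenotypes
more concrete, the following Python sketch scores each disorder by the
confidence of the rule "observed phenotypes imply disorder" over a pool of
cases. It is a simplified illustration and not the SKELETOME mechanism:
the function name and the case representation are hypothetical, any case
data supplied to it would be placeholder values, and the refinement step
based on class axioms is not modelled.

```python
from collections import Counter
from typing import FrozenSet, Iterable, List, Tuple

# Hypothetical case representation: (observed phenotypes, diagnosed disorder).
Case = Tuple[FrozenSet[str], str]


def rank_disorders(
    cases: Iterable[Case], observed: FrozenSet[str]
) -> List[Tuple[str, float]]:
    """Rank disorders by the confidence of the rule observed -> disorder.

    Confidence is the fraction of cases exhibiting all observed phenotypes
    that were diagnosed with the given disorder.
    """
    matching = [
        disorder for phenotypes, disorder in cases if observed <= phenotypes
    ]
    if not matching:
        return []
    counts = Counter(matching)
    total = len(matching)
    ranked = [(disorder, n / total) for disorder, n in counts.items()]
    # Highest confidence first; ties are broken alphabetically.
    ranked.sort(key=lambda item: (-item[1], item[0]))
    return ranked
```

With three placeholder cases that all exhibit a phenotype "p1", two of
them diagnosed as "disorder_x" and one as "disorder_y", the sketch ranks
"disorder_x" first with confidence 2/3 and "disorder_y" second with
confidence 1/3.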

The SKELETOME platform and knowledge base, underpinned by the Bone
Dysplasia Ontology, provide a means by which experts in the skeletal
dysplasias domain can collaboratively document, expand and maintain a
curated body of knowledge, which we expect to accelerate innovation and
scientific breakthroughs in their field.

Conclusions 

The Bone Dysplasia Ontology described in this paper represents the most
comprehensive structured knowledge source for the skeletal dysplasias
domain. It provides the means for integrating and annotating clinical and
research data, not only at the generic domain knowledge level, but also
at the level of individual patient case studies, by enabling links
between individual cases and publicly available genotype and phenotype
resources. The community-driven curation process ensures a shared
conceptualisation of the domain knowledge and its continuous incremental
evolution. Future development of both the ontology and the SKELETOME
platform will focus on advancing the reasoning and knowledge extraction
services, which we hope will lead to the discovery of previously unknown
relationships between gene mutations, phenotype characteristics and bone
dysplasias, and to the discovery of new drugs to combat disorders
associated with human bone and cartilage development.